Transformation-Based Learning meets Frequent Pattern Discovery

Author

  • Luc Dehaspe
Abstract

Transformation-based learning (TBL) and frequent pattern discovery (FPD) are two popular research paradigms, one from the domain of empirical natural language processing, the second from the field of data mining. This paper describes how Eric Brill's original TBL algorithm can be improved via incorporation of FPD techniques. The algorithm B-Warmr is presented that upgrades TBL to first-order logic and speeds up the search for transformations, also in the original propositional case. We demonstrate some scaling properties of B-Warmr and discuss how the algorithm can be tuned to generate (first-order) decision lists. We also propose a new method, Disjunctive Transformation-Based Learning (DTBL), that combines the advantages of TBL and decision lists.

Similar resources

Dataset Filtering Techniques in Constraint-Based Frequent Pattern Mining

Many data mining techniques consist in discovering patterns frequently occurring in the source dataset. Typically, the goal is to discover all the patterns whose frequency in the dataset exceeds a user-specified threshold. However, very often users want to restrict the set of patterns to be discovered by adding extra constraints on the structure of patterns. Data mining systems should be able to...

The Discovery of Frequent Patterns with Logic and Constraint Programming

The basic goal of data mining is to discover patterns occurring in databases, such as associations, classification models, sequential patterns, and so on. In this paper we focus on the problem of frequent pattern discovery, which is the process of searching for patterns, such as sets of features or items, that appear frequently in data. Such frequent patterns can reveal associations, correlat...

Discovery of Frequent Episodes in Event Logs

The lion's share of process mining research focuses on the discovery of end-to-end process models describing the characteristic behavior of observed cases. The notion of a process instance (i.e., the case) plays an important role in process mining. Pattern mining techniques (such as frequent itemset mining, association rule learning, sequence mining, and traditional episode mining) do not consider ...

Automata Theory Approach for Solving Frequent Pattern Discovery Problems

The various types of frequent pattern discovery problems, namely the frequent itemset, sequence, and graph mining problems, are solved in different ways that are nonetheless similar in certain aspects. The main approaches to discovering such patterns fall into two classes: levelwise methods and database projection-based methods. The level-...

Approximate Frequent Pattern Discovery Over Data Stream

Frequent pattern discovery over a data stream is a hard problem because the continuously generated nature of a stream does not allow each data element to be revisited. Furthermore, the pattern discovery process must be fast to produce timely results. Based on these requirements, we propose an approximate approach to tackle the problem of discovering frequent patterns over a continuous stream. Our approximatio...

Publication date: 1999